feat(unofficial): mock 950dt + huawei-mindie for unofficial runs#257

Open
Oseltamivir wants to merge 12 commits into master from claude/huawei-950dt-mindie-unofficial

@Oseltamivir Oseltamivir commented Apr 27, 2026

Contributor

Summary

Unblocks loading the Mock-ascend run 25014782858 via the unofficial-run viewer (?unofficialrun=25014782858).

The mock artifacts use:

  • hw: 950dt → new GPU, Huawei Ascend 950DT (new vendor)
  • framework: huawei-mindie → new framework
  • model: deepseek-ai/DeepSeek-V4-Pro — already supported (dsv4)
  • precision: fp4, sequences 8K/1K, 1K/1K — already supported

Without these mappings, hwToGpuKey('950dt') returns null and rows are silently skipped by the ETL → the unofficial-run viewer shows nothing.

Changes

  • packages/constants/src/gpu-keys.ts
    • HW_REGISTRY['950dt'] — vendor Huawei, arch Ascend, label Ascend 950DT, sort 9. TDP/power/cost set to placeholder 9.99 — Huawei figures are not yet finalized; bump to real values when available.
    • New huawei color zone in VENDOR_OKLCH_ZONES (amber 60–90°) and VENDOR_HSL_ZONES (amber 30–60°). To make room in the HSL wheel, AMD's lower band shrinks from 0–60 to 0–30 and NVIDIA's band shrinks from 60–195 to 90–195. AMD and NVIDIA each keep at least 90° (AMD: 300–360 plus 0–30; NVIDIA: 90–195).
  • packages/constants/src/framework-aliases.ts
    • FW_REGISTRY['huawei-mindie'] → label Huawei MindIE.
  • packages/app/src/lib/dynamic-colors.ts
    • Vendor type now includes 'huawei'; getVendor() recognizes the Huawei vendor string.
  • packages/app/src/lib/chart-utils.ts
    • BANNED_HUE_TEST / PREFERRED_ZONE huawei entries — avoids AMD-red (320–40°) and NVIDIA-green (120–195°) in CIELab; prefers amber/yellow (50–95°).
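
The failure mode described in the summary (an unregistered hw string resolving to null and its rows being skipped) can be sketched as follows. This is a minimal illustration, not the repository's code: `HW_REGISTRY_SKETCH`, `hwToGpuKeySketch`, and `keepMappableRows` are hypothetical stand-ins, and the real registry entries carry more fields (TDP, power, cost) than shown here.

```typescript
// Hypothetical, simplified registry shape; assumed for illustration only.
interface HwEntrySketch {
  vendor: string;
  arch: string;
  label: string;
  sort: number;
}

const HW_REGISTRY_SKETCH: Record<string, HwEntrySketch> = {
  "950dt": { vendor: "Huawei", arch: "Ascend", label: "Ascend 950DT", sort: 9 },
};

// Returns the registry key for a raw hw string, or null when it is not registered.
function hwToGpuKeySketch(hw: string): string | null {
  return Object.prototype.hasOwnProperty.call(HW_REGISTRY_SKETCH, hw) ? hw : null;
}

interface RawRowSketch {
  hw: string;
  tput_per_gpu: number;
}

// Rows whose hw does not resolve are dropped without an error, which is how
// an unregistered GPU ends up rendering nothing in the viewer.
function keepMappableRows(rows: RawRowSketch[]): RawRowSketch[] {
  return rows.filter((row) => hwToGpuKeySketch(row.hw) !== null);
}
```

With the `950dt` entry present, rows for that hw survive the filter; removing the entry reproduces the empty viewer.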

Notes for review

  • This is added solely so the unofficial-run viewer can render the mock data — it is not intended to ingest mock benchmarks into the production DB. The mock artifacts use tput_per_gpu: 510.2041, median_ttft: 0.0, etc. — fake values.
  • Cost/power/TDP are placeholders; once real Huawei pricing/specs are known, replace 9.99 and tune the chart sort order.

Test plan

  • pnpm typecheck
  • pnpm lint
  • pnpm test:unit (1896 tests pass across all packages)
  • On Vercel preview, load ?unofficialrun=25014782858 — confirm 950dt overlay points render under DeepSeek-V4-Pro at 1K/1K, 8K/1K, etc., with conc 16/64/128.

🤖 Generated with Claude Code


Note

Medium Risk
Touches chart data keys, legend reset, URL state, and unofficial-run loading across several providers; behavior is well covered by unit tests but preview validation for the target run is still important.

Overview
Adds Huawei Ascend 950DT (950dt) and Huawei MindIE to shared registries with placeholder power/cost/TDP and new Huawei vendor color zones (OKLch/HSL and CIELab helpers), so Mock-ascend CI artifacts map through ETL instead of being dropped.

The unofficial-run API now accepts bmk_* per-config artifacts when results_bmk is missing (concatenates rows), and sequence handling gains 8k/256 and 8k/625 via islOslToSequence instead of the old 1k/8k-only prefix hack.

Inference charts get multi-sequence overlay (MultiSequenceSelector, i_seq_extra, __seq<compact> synth hwKeys), a “Show as ingested” mode that merges unofficial points into the main scatter pipeline (mergeUnofficialIntoOfficial, i_uoff_ingested), and legend/color/reset fixes so new synth series stay visible. Branch-only defaults auto-pick 8K/256 and output throughput per GPU when an unofficial run loads without URL pins.

Reviewed by Cursor Bugbot for commit c9c31a7. Bugbot is set up for automated code reviews on this repo.

Adds Huawei Ascend 950DT GPU and Huawei MindIE framework so the Mock-ascend
GitHub Actions run (25014782858) can be loaded via the unofficial-run viewer.
TDP/power/cost are placeholder 9.99 — Huawei specs are not yet finalized.

- HW_REGISTRY: new 950dt entry, vendor=Huawei, arch=Ascend, sort=9
- VENDOR_OKLCH_ZONES / VENDOR_HSL_ZONES: amber/yellow zone for Huawei
  (steals 30-60° HSL from AMD's lower band; AMD keeps 300-360 + 0-30)
- FW_REGISTRY: huawei-mindie → "Huawei MindIE"
- dynamic-colors Vendor type extended with 'huawei'
- chart-utils BANNED_HUE_TEST/PREFERRED_ZONE entries for huawei (avoid
  AMD-red and NVIDIA-green clashes; preferred amber/yellow zone)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented Apr 27, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| inferencemax-app | Ready | Preview, Comment | Jun 14, 2026 3:12am |

The Mock-ascend workflow uploads one bmk_<config>_conc<N>_<gpu>_<idx>
artifact per concurrency instead of a single aggregated results_bmk.
The unofficial-run API was filtering strictly on name === 'results_bmk',
returning 404 even though the per-config artifacts contained valid rows.

When results_bmk is absent, gather every artifact whose name starts with
'bmk_', download each, and concat the rows before normalization.
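
The fallback described above can be sketched roughly as below. It assumes the artifacts have already been downloaded and parsed into rows; `DownloadedArtifactSketch` and `collectBenchmarkRows` are hypothetical names, and the real API route fetches each artifact before reading it.

```typescript
type BenchmarkRowSketch = Record<string, unknown>;

// Assumed shape: an artifact name plus its already-parsed rows.
interface DownloadedArtifactSketch {
  name: string;
  rows: BenchmarkRowSketch[];
}

// Prefer the aggregated results_bmk artifact; otherwise concatenate every
// per-config bmk_* artifact. null means "nothing usable" (the 404 case).
function collectBenchmarkRows(
  artifacts: DownloadedArtifactSketch[],
): BenchmarkRowSketch[] | null {
  const aggregated = artifacts.find((a) => a.name === "results_bmk");
  if (aggregated !== undefined) return aggregated.rows;

  const perConfig = artifacts.filter((a) => a.name.startsWith("bmk_"));
  if (perConfig.length === 0) return null;

  const rows: BenchmarkRowSketch[] = [];
  for (const artifact of perConfig) rows.push(...artifact.rows);
  return rows;
}
```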

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Oseltamivir Oseltamivir reopened this Apr 27, 2026
@Oseltamivir

Contributor Author

@claude

Make it able to render 8192 input tokens, 256 output tokens, like from runs/25457884421

@github-actions

github-actions Bot commented May 6, 2026

Contributor

Claude finished @Oseltamivir's task in 7m 56s · branch: claude/pr-257-20260506-2045


Done — registered the 8K/256 sequence (8192/256)

Run 25457884421 is the InfiniteBench DSv4 sweep that emits isl=8192, osl=256. The unofficial-run path was silently dropping every row because islOslToSequence(8192, 256) returned null (and getModelAndSequenceFromArtifact would mis-map it to 8k8k due to a binary 1k-vs-8k branch).

What changed

  • packages/constants/src/models.ts: registered '8k/256' as {isl: 8192, osl: 256} in both sequenceToIslOsl and islOslToSequence.
  • packages/app/src/lib/data-mappings.ts: added Sequence.EightK_256 = '8k/256' and a SEQUENCE_CONFIG entry (label '8K / 256', compact '8k256', category 'default'). Replaced the binary 1k-vs-8k ternary in getModelAndSequenceFromArtifact with a lookup against islOslToSequence, so future ISL/OSL pairs registered in constants are picked up automatically rather than silently mis-mapped.
  • Tests: added round-trip + label coverage in packages/constants/src/models.test.ts and a new getModelAndSequenceFromArtifact case for dsv4 + 8192/256 in packages/app/src/lib/data-mappings.test.ts.

The flow that now works for ?unofficialrun=25457884421:

  1. Artifact JSON ships isl: 8192, osl: 256 → mapBenchmarkRow passes them through unchanged.
  2. buildChartData in unofficial-run-provider.tsx:113 calls islOslToSequence(8192, 256) → returns '8k/256' (was null).
  3. GlobalFilterContext availableSequences picks up '8k/256' from unofficialAvailable, the dropdown surfaces it, and useChartData filters rows by {isl: 8192, osl: 256}.
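
The lookup in step 2 can be sketched as a table scan instead of a two-way branch. This is an illustration under assumptions: the `Sketch` names are hypothetical, only three sequences are listed, and 1k is assumed to mean 1024 tokens.

```typescript
interface IslOslSketch {
  isl: number;
  osl: number;
}

// Assumed subset of the registered sequences; the real map lives in
// packages/constants/src/models.ts and may contain more entries.
const SEQUENCE_TO_ISL_OSL_SKETCH: Record<string, IslOslSketch> = {
  "1k/1k": { isl: 1024, osl: 1024 },
  "8k/1k": { isl: 8192, osl: 1024 },
  "8k/256": { isl: 8192, osl: 256 },
};

// Reverse lookup: returns the sequence key for an exact (isl, osl) match, or
// null for an unregistered pair, so callers can skip the row explicitly
// instead of mislabeling it.
function islOslToSequenceSketch(isl: number, osl: number): string | null {
  for (const seq of Object.keys(SEQUENCE_TO_ISL_OSL_SKETCH)) {
    const shape = SEQUENCE_TO_ISL_OSL_SKETCH[seq];
    if (shape.isl === isl && shape.osl === osl) return seq;
  }
  return null;
}
```

Because the reverse lookup reads the same table as the forward map, registering a new pair in one place is enough for both directions.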

Checks

  • pnpm typecheck
  • pnpm lint ✅ (0 warnings, 0 errors)
  • pnpm test:unit — app: 1732/1732 ✅, constants: 22/22 ✅. The 3 failures in packages/db/src/connection.test.ts (postgres TLS for localhost) are pre-existing on master, unrelated to this change — verified by git stash && pnpm --filter db test:unit.

Not done

  • No browser verification — I don't have GitHub artifact access from this sandbox to actually load ?unofficialrun=25457884421 against the dev server, so I can't visually confirm the overlay points render. Once the Vercel preview is up, please load that URL and confirm overlay points appear under DeepSeek-V4-Pro at the new 8K / 256 sequence option. AGENTS.md flags this as a mandatory step for inference-path features.

Todos

  • Find where ISL/OSL sequences are registered
  • Inspect run 25457884421 to confirm ISL/OSL
  • Add 8192/256 to sequenceToIslOsl / islOslToSequence
  • Add Sequence.EightK_256 + SEQUENCE_CONFIG entry
  • Generalize getModelAndSequenceFromArtifact
  • Update tests
  • typecheck / lint / unit
  • Commit and push

· branch: claude/pr-257-20260506-2045

@Oseltamivir Oseltamivir reopened this May 6, 2026
@Oseltamivir

Contributor Author

@claude

Make it able to render 8192 input tokens, 256 output tokens, like from runs/25457884421

Use this PR/branch, don't create a new branch

@github-actions

github-actions Bot commented May 6, 2026

Contributor

Claude finished @Oseltamivir's task in 4m 23s


Done — registered 8K/256 sequence (8192/256) on this PR branch

Pushed commit f1549ed to claude/huawei-950dt-mindie-unofficial. Sweep run 25457884421 ships isl=8192, osl=256; islOslToSequence previously returned null and the rows were silently dropped by the unofficial-run viewer.

Changes

  • packages/constants/src/models.ts: added '8k/256' as {isl: 8192, osl: 256} in both sequenceToIslOsl and islOslToSequence.
  • packages/app/src/lib/data-mappings.ts: added Sequence.EightK_256 = '8k/256' and a SEQUENCE_CONFIG entry (label '8K / 256', compact '8k256', category 'default'). Replaced the binary 1k-vs-8k ternary in getModelAndSequenceFromArtifact with islOslToSequence, so future ISL/OSL pairs registered in constants are picked up automatically rather than silently mis-mapped (the old ternary would have mapped osl=256 → 8k, producing a wrong 8k1k sequence key).
  • Tests: added round-trip + label coverage in packages/constants/src/models.test.ts and packages/app/src/lib/models-mapping.test.ts, plus a new getModelAndSequenceFromArtifact case for dsv4 + 8192/256 in packages/app/src/lib/data-mappings.test.ts.

The flow that now works for ?unofficialrun=25457884421:

  1. Artifact JSON ships isl: 8192, osl: 256 → mapBenchmarkRow passes them through.
  2. unofficial-run-provider.tsx:113 calls islOslToSequence(8192, 256) → returns '8k/256' (was null).
  3. GlobalFilterContext availableSequences picks up '8k/256' from unofficialAvailable, the sequence dropdown surfaces it, and useChartData filters rows by {isl: 8192, osl: 256}.

Checks

  • pnpm typecheck — clean
  • pnpm --filter @semianalysisai/inferencex-constants test:unit — 22/22 passed
  • pnpm --filter @semianalysisai/inferencex-app test:unit — 1689/1689 passed
  • pnpm lint — 0 new errors. The 41 errors reported are pre-existing on master (verified by git stash && pnpm lint); they're all no-underscore-dangle violations in files I didn't touch.

Not done — needs your verification

  • No browser verification. The dev server in this sandbox can't fetch GitHub artifacts, so I can't load ?unofficialrun=25457884421 and visually confirm the overlay points render. AGENTS.md flags Playwright verification as mandatory for inference-path features — please load the Vercel preview at ?unofficialrun=25457884421 once it deploys and confirm DSv4 overlay points appear under the new 8K / 256 sequence option.
    • Branch: claude/huawei-950dt-mindie-unofficial

Registers '8k/256' (isl=8192, osl=256) so unofficial-run overlays from
sweeps like InferenceX run 25457884421 render instead of being silently
dropped by islOslToSequence returning null.

- packages/constants/src/models.ts: add 8k/256 to both conversion maps.
- packages/app/src/lib/data-mappings.ts: add Sequence.EightK_256 with
  SEQUENCE_CONFIG entry (label "8K / 256"). Replace the binary
  1k-vs-8k ternary in getModelAndSequenceFromArtifact with islOslToSequence
  so future ISL/OSL pairs registered in constants are picked up
  automatically rather than mis-mapped.
- Tests added for round-trip + label coverage in models.test.ts /
  models-mapping.test.ts and a dsv4 + 8192/256 case in data-mappings.test.ts.

Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>
@Oseltamivir

Contributor Author

@claude

For unofficial runs, add a toggle where the dashboard treats them as ingested runs.

  • They show their actual names in the legend; if a job has multiple, they're treated as separate lines
  • When this is toggled, the plots should respond to the other toggles (such as "Optimal only"); currently they don't.
  • Ensure everything works for multiple unofficial runs
  • Animation isn't necessary

@github-actions

github-actions Bot commented May 7, 2026

Contributor

Claude finished @Oseltamivir's task in 24m 55s


Done — "Show as ingested" toggle for unofficial runs

Pushed 5e327a7 to claude/huawei-950dt-mindie-unofficial. When the toggle is on, unofficial-run rows are promoted into the same series pipeline as DB-ingested data, so they participate in Optimal-only, hardware toggles, precision filters, and per-roofline grouping.

How it works

  • Each (run, original hwKey) pair is rewritten to a synth key ${origHwKey}__uorun${runId} (e.g. b200_vllm__uorun25457884421). The GPU base prefix is preserved so getModelSortIndex / isKnownGpu keep working — only the suffix is new.
  • A synthesized HardwareEntry is generated for each synth key with label = "${origLabel} • ${branch}" so the legend shows e.g. B200 • infinitebench-dsv4-b200-20260506-125800 (vLLM).
  • A per-key color override (from the existing overlayRunColor palette) is threaded through InferenceContext → ScatterGraph's resolveColor so each run still gets its distinct color even after the GPU vendor system would normally re-paint by base GPU.
  • Multiple runs surface as separate legend entries; one job that benchmarks several GPUs surfaces those GPUs as separate legend entries too — (run, GPU) is the legend granularity.
  • The X-shape overlay layer is suppressed when merged, and stale .unofficial-overlay-pt / .overlay-roofline-path DOM is cleaned up on the toggle-flip (the D3 layer system has no built-in teardown for dropped layers).
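
The synth-key encoding described in the list above can be sketched as a small encode/decode pair. The function names carry a `Sketch` suffix because the bodies are assumptions; only the `${origHwKey}__uorun${runId}` shape and the preserved GPU prefix come from the description.

```typescript
const UORUN_TAG_SKETCH = "__uorun";

// Encode: append the run tag so each (run, hwKey) pair gets its own series key.
function makeSynthHwKeySketch(origHwKey: string, runId: string): string {
  return `${origHwKey}${UORUN_TAG_SKETCH}${runId}`;
}

// Decode: peel only the trailing run tag; returns null for keys that are not synth keys.
function parseSynthHwKeySketch(
  hwKey: string,
): { origHwKey: string; runId: string } | null {
  const idx = hwKey.lastIndexOf(UORUN_TAG_SKETCH);
  if (idx <= 0) return null;
  const runId = hwKey.slice(idx + UORUN_TAG_SKETCH.length);
  if (!/^\d+$/.test(runId)) return null;
  return { origHwKey: hwKey.slice(0, idx), runId };
}

// The GPU base is everything before the first underscore, so it is unchanged
// by the suffix and prefix-based helpers keep resolving the same GPU.
function gpuBaseOfSketch(hwKey: string): string {
  return hwKey.split("_")[0];
}
```

For example, `makeSynthHwKeySketch("b200_vllm", "25457884421")` yields `b200_vllm__uorun25457884421`, whose GPU base is still `b200`.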

What changed

  • packages/app/src/lib/unofficial-merge.ts (new) — mergeUnofficialIntoOfficial, makeSynthHwKey, parseSynthHwKey, isSynthHwKey, buildStubGraphsForMerge. Reuses processOverlayChartData so the merged points sit in the same x/y coordinate space the official pipeline uses.
  • packages/app/src/lib/unofficial-merge.test.ts (new) — 10 tests covering synth-key round-trips, no-op behavior when no overlay matches the (model, sequence), single-run merge, multi-run separation (same GPU under two runs gets two synth keys + two colors), preserving official rows, and stub-graph synthesis when official graphs is empty.
  • packages/app/src/components/unofficial-run-provider.tsx — adds mergeAsIngested + setMergeAsIngested to context. URL-synced via i_uoff_ingested=1 (with a hydration-safe useEffect re-sync).
  • packages/app/src/components/inference/InferenceContext.tsx — calls the merger after useChartData, exposes merged graphs/hardwareConfig and hwColorOverrides through context. Result memoized, so the toggle adds zero overhead when off.
  • packages/app/src/components/inference/ui/ScatterGraph.tsx — wraps resolveColor to consult hwColorOverrides first; adds the "Show as ingested" switch (only rendered when isUnofficialRun); cleanup effect removes stale overlay DOM when overlayData drops to null.
  • packages/app/src/components/inference/ui/ChartDisplay.tsx — sets overlayDataByChartType to null when mergeAsIngested is on, so the X-shape overlay layer doesn't render alongside the merged points.
  • packages/app/src/lib/url-state.ts — registers i_uoff_ingested as a known URL state key.
  • packages/app/src/components/inference/types.ts, packages/app/cypress/support/mock-data.ts — type-level wiring for hwColorOverrides and the new toggle fields.

Analytics: latency_unofficial_as_ingested_toggled fires on every flip via the existing track() convention.

Browser verification (Playwright)

Walked through the user-facing flow against localhost:3000:

  • ?unofficialrun=25457884421 (InfiniteBench DSv4 8K/256, B200 vLLM) — toggling on flips the 2 X-shape overlay points to ingested-style dot-groups with synth hwKey = "b200_vllm__uorun25457884421". Optimal-only is on; both points are Pareto-optimal so both render. Toggling off returns to overlay X-shapes (URL param drops too).
  • ?unofficialruns=25014782858 (Mock-ascend, DSv4 8K/1K) — synth hwKey = "950dt_huawei-mindie_mtp__uorun25014782858" co-renders with 374 official DSv4 points; no overlay X-shapes remain.
  • ?unofficialruns=25457884421,25014782858 — both run chips show in the banner; each contributes its own synth-keyed legend entry on the slices where it has data.
  • Reload with i_uoff_ingested=1 in the URL → toggle starts checked (state is hydration-safe, with a useEffect re-sync to handle the SSR default).

Checks

  • pnpm typecheck — clean
  • pnpm --filter @semianalysisai/inferencex-app test:unit — 1699/1699 passed (10 new tests in unofficial-merge.test.ts)
  • pnpm --filter @semianalysisai/inferencex-constants test:unit — 22/22 passed
  • pnpm lint — 0 new errors (the 41 reported are pre-existing on master, all no-underscore-dangle in files I didn't touch — verified by git stash && pnpm lint)
  • Browser-verified with Playwright per the cases above
    · branch: claude/huawei-950dt-mindie-unofficial

…ries

Adds a "Show as ingested" switch in the inference scatter legend (visible
when an unofficial run is loaded) that promotes overlay rows to first-class
points so they participate in the same filter pipeline as DB-ingested data:
hardware toggles, Optimal-only, precision filtering, and per-(hwKey, run)
roofline grouping.

Each (run, original hwKey) pair is rewritten to a synthesized hwKey of the
form `${origHwKey}__uorun${runId}` — preserving the GPU base prefix so
`getModelSortIndex` and `isKnownGpu` keep working — with a synthesized
HardwareEntry whose label embeds the run branch and a per-run color override
drawn from the existing overlay palette. Multiple runs and per-run multi-GPU
configs each surface as their own legend entry.

URL-synced via `i_uoff_ingested=1` so the state survives reloads and shares.
The X-shape overlay layer is suppressed and stale DOM is cleaned up when the
toggle flips on (the layer system has no built-in teardown for layers that
drop out of the array). Unit tests cover synth-key encoding, no-op behavior
when no overlay matches, and multi-run separation.

Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>
@Oseltamivir

Contributor Author

@claude

for the previous edit, the results look good, but continue to build on it:

  • I don't want the name "infinitebench-dsv4-b200-20260506-125800" to be in the legend
  • I want the colours to follow ingested runs, the B300 plots are showing up as the same red. They should be different shades of green for Nvidia

@github-actions

github-actions Bot commented May 7, 2026

Copy link
Copy Markdown
Contributor

Claude finished @Oseltamivir's task in 12m 4s


Done — bare GPU labels + vendor-zone colors for ingested unofficial rows

Pushed 323d1b4 to claude/huawei-950dt-mindie-unofficial. Both fixes live in packages/app/src/lib/unofficial-merge.ts.

What changed

  • Legend label drops the branch. makeSynthHardwareEntry previously produced "${baseLabel} • ${branch}"; now it returns just baseLabel (e.g. Ascend 950DT (Huawei MindIE, MTP) instead of ... • mock-ascend-950dt-deepseekv4-20260427-185322). The branch + run URL still surface in gpu, so the row tooltip retains run provenance.
  • No more per-run color override. The merge no longer populates colorOverrides[synthHwKey] = overlayRunColor(runIdx). Synth keys preserve the original GPU base prefix (b200_vllm__uorun123), so getVendor() recognizes them and generateVendorColors distributes a hue from the vendor's OKLCh zone — exactly the path official rows already use. That's why two NVIDIA GPUs from one unofficial run now get different shades of green rather than two copies of the same --overlay-run-0 red.

The wiring (hwColorOverrides → ScatterGraph.resolveColor) stays in place; the merge just returns an empty {} map. Future callers can still pin a specific color per synth key without further refactors.
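
To illustrate why vendor-zone coloring gives distinct shades per GPU, here is a minimal sketch that spreads hues evenly across a vendor's OKLCH band. The even spacing is an assumption made for illustration; the thread does not show how generateVendorColors actually picks hues. The band limits are the ones quoted in this PR (NVIDIA 120–170, AMD 12–42, Huawei 60–90).

```typescript
type VendorSketch = "nvidia" | "amd" | "huawei";

// Hue bands in degrees, as quoted in this PR for VENDOR_OKLCH_ZONES.
const OKLCH_ZONES_SKETCH: Record<VendorSketch, [number, number]> = {
  nvidia: [120, 170],
  amd: [12, 42],
  huawei: [60, 90],
};

// Assumed strategy: split the band into equal slices and use each slice's
// midpoint, so every key gets a distinct hue that stays inside the band.
function spreadHuesSketch(
  hwKeys: string[],
  vendor: VendorSketch,
): Record<string, number> {
  const [lo, hi] = OKLCH_ZONES_SKETCH[vendor];
  const step = (hi - lo) / hwKeys.length;
  const hues: Record<string, number> = {};
  hwKeys.forEach((key, i) => {
    hues[key] = lo + step * (i + 0.5);
  });
  return hues;
}
```

Two NVIDIA synth keys from one run would land on two different greens under this scheme, which matches the behavior the integration tests assert.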

Browser verification (Playwright)

Loaded ?unofficialruns=25457884421,25014782858&i_uoff_ingested=1 against localhost:3000, switched to DSv4 / 8K / 1K, hit "Reset filter" to expand the legend, and dumped the swatch styles:

  • Ascend 950DT (Huawei MindIE, MTP) → oklch(0.64 0.18 75): Huawei amber zone (60–90)
  • GB300 NVL72 (Dynamo SGLang) → oklch(0.78 0.15 121.7): NVIDIA green
  • B300 (vLLM) → oklch(0.62 0.15 148.3): NVIDIA green, not red
  • B200 (vLLM) → oklch(0.52 0.15 165): NVIDIA green, distinct shade
  • MI355X (SGLang) → oklch(0.5 0.22 34.5): AMD red
  • MI355X (ATOM¹) → oklch(0.78 0.22 19.5): AMD red

Each NVIDIA GPU listed above sits between hue 121.7 and 165, a different shade per GPU, as requested. No legend label contains "infinitebench-..." or "mock-ascend-..." anywhere. Switched to 8K / 256 to confirm the InfiniteBench overlay (B200 (vLLM)) also receives a green hue (oklch(0.64 0.15 145)) instead of the previous overlay-run red.

Tests

Added 3 integration cases to packages/app/src/lib/unofficial-merge.test.ts:

  • two NVIDIA synth keys (B200 + B300) from a single run get distinct colors;
  • vendor-zone hues stay inside their bands (NVIDIA 120–170, AMD 12–42 per VENDOR_OKLCH_ZONES);
  • two synth keys for the same GPU but different runs still get distinct hues.

Updated the existing label/color assertions to expect bare GPU labels and an empty colorOverrides map. The gpu field is still asserted to contain UNOFFICIAL: <branch> so we don't regress the tooltip provenance.

Checks

  • pnpm typecheck — clean
  • pnpm exec vitest run src/lib/unofficial-merge.test.ts — 13/13 passed (10 original + 3 new)
  • pnpm exec vitest run (whole app) — 1702/1702 passed
  • pnpm exec oxlint packages/app/src/lib/unofficial-merge.ts unofficial-merge.test.ts — 0 errors. The 41 pnpm lint repo-wide errors are pre-existing on master (no-underscore-dangle in files I didn't touch).
  • Browser-verified with Playwright per the swatch dump above
    · branch: claude/huawei-950dt-mindie-unofficial

Drops the branch suffix from synth-key labels and the per-run color
override so unofficial rows promoted via "Show as ingested" share the
same vendor-aware palette as official rows. Two NVIDIA GPUs from one
unofficial run now get distinct shades of green instead of one shared
overlay-palette red. The branch + run URL still surface in `gpu` so the
row tooltip retains provenance.

Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>
@Oseltamivir

Contributor Author

@claude

Just for this branch, temporarily set the default view to "8k / 256" and "Output Token Throughput per GPU". This default should apply after an unofficial run automatically changes the model (introduced in #243).

@github-actions

github-actions Bot commented May 7, 2026

Copy link
Copy Markdown
Contributor

Claude finished @Oseltamivir's task in 8m 40s


Done — temp default to 8K / 256 + Output Token Throughput per GPU on unofficial-run load

Pushed 462dc0f to claude/huawei-950dt-mindie-unofficial. When ?unofficialrun= loads and the URL doesn't pin i_seq / i_metric, the dashboard now defaults the sequence to 8K / 256 and the y-axis metric to Output Token Throughput per GPU, alongside the existing model auto-switch from #243.

What changed

  • packages/app/src/lib/unofficial-run-auto-switch.ts — added a sibling pure helper computeUnofficialOverrideDecision(unofficialAvailable, urlValue, lastKey) that mirrors computeAutoSwitchDecision's dedupe pattern: empty overlay → reset key, URL-pinned param → no-op (preserve the same key so a later URL clear can re-fire), same model-set key → no-op, otherwise → fire override and advance key. Same sorted-models key shape as the auto-switch so a sequence-only delta doesn't invalidate a manual user pick.
  • packages/app/src/components/GlobalFilterContext.tsx — added a useEffect that calls the helper with getUrlParam('i_seq') and a lastUnofficialSeqOverrideRef, then setSelectedSequence(Sequence.EightK_256) when shouldOverride is true. Lives next to the existing lastAutoSwitchKeyRef effect so the two are reviewed together.
  • packages/app/src/components/inference/InferenceContext.tsx — analogous effect using getUrlParam('i_metric') + lastUnofficialMetricOverrideRef, calling setSelectedYAxisMetric('y_outputTputPerGpu'). Pulled availableModelsAndSequences out of useUnofficialRun() (already in the same destructure block) so the effect can key off the same overlay set the auto-switch sees.
  • packages/app/src/lib/unofficial-run-auto-switch.test.ts — added 6 unit tests for the new helper: empty overlay resets, fresh fire with no URL pin, URL pin respected, no re-fire after applied, re-arm after overlay clears, sequence-only delta ignored, deterministic key across insertion orders.

Both effects are clearly tagged TEMPORARY (this branch only) so the override is easy to revert.
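
The four-way decision described for the helper can be sketched as a pure function. The input and output shapes here (`AvailablePairSketch`, `OverrideDecisionSketch`, a `|`-joined key) are assumptions; only the branch order and the sorted-models key idea come from the description above.

```typescript
interface AvailablePairSketch {
  model: string;
  sequence: string;
}

interface OverrideDecisionSketch {
  shouldOverride: boolean;
  nextKey: string | null;
}

function computeOverrideDecisionSketch(
  available: AvailablePairSketch[],
  urlValue: string | null,
  lastKey: string | null,
): OverrideDecisionSketch {
  // Empty overlay: reset the key so a later load can fire again.
  if (available.length === 0) return { shouldOverride: false, nextKey: null };
  // URL-pinned param: do nothing and keep the key as it was.
  if (urlValue !== null && urlValue !== "") {
    return { shouldOverride: false, nextKey: lastKey };
  }
  // Key depends only on the sorted, de-duplicated model set, so a
  // sequence-only change does not re-trigger the override.
  const models = Array.from(new Set(available.map((p) => p.model))).sort();
  const key = models.join("|");
  if (key === lastKey) return { shouldOverride: false, nextKey: lastKey };
  return { shouldOverride: true, nextKey: key };
}
```

A caller would store `nextKey` in a ref after each evaluation and apply the default only when `shouldOverride` is true.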

Browser verification (Playwright on localhost:3000)

  • ?unofficialrun=25457884421 (no pins) — model auto-switched to DeepSeek V4 Pro, sequence dropdown landed on 8K / 256, y-axis on Output Token Throughput per GPU, NON-OFFICIAL banner present.
  • ?unofficialrun=25457884421&i_seq=1k%2F1k&i_metric=y_tpPerGpu — sequence stayed at 1K / 1K, y-axis stayed at Token Throughput per GPU (URL pins respected). Model still auto-switched because g_model wasn't pinned.

Checks

  • pnpm typecheck — clean
  • pnpm --filter @semianalysisai/inferencex-app exec vitest run src/lib/unofficial-run-auto-switch.test.ts — 15/15 passed
  • pnpm --filter @semianalysisai/inferencex-app test:unit — 1773/1773 passed
  • pnpm lint — 0 warnings, 0 errors
  • Browser-verified per the cases above

Notes

  • The 2 console errors observed during the URL-pinned navigation case are pre-existing (a script-tag warning and a SSR/CSR hydration mismatch on SequenceSelector from the existing useState initializer reading the URL on the client — happens for any i_seq= URL pin, regardless of this change). Not introduced by this commit.
  • The override is gated by both the dedupe ref and the URL-pin check, so manual sequence/metric changes (which write back to the in-memory URL store and then to the address bar) are safe — the next render reads the pinned URL value via getUrlParam and bails. Switching to a different unofficial run with a different model set re-arms the override (same behavior as the model auto-switch).
    · branch: claude/huawei-950dt-mindie-unofficial

… on unofficial-run load

Temporary branch-only override: when a `?unofficialrun=` URL loads and the
user did not pin `i_seq` / `i_metric` themselves, default the sequence to
`8K / 256` and the y-axis metric to `Output Token Throughput per GPU` so
InfiniteBench-style sweeps land on a useful default view alongside the
model auto-switch from #243.

Mirrors the dedupe shape of `computeAutoSwitchDecision` via a new pure
helper `computeUnofficialOverrideDecision` so manual user picks stick
once URL-synced and a fresh run-set transition can re-arm the override.

Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>
@Oseltamivir Oseltamivir changed the title feat(unofficial): support 950dt + huawei-mindie for unofficial runs feat(unofficial): mock 950dt + huawei-mindie for unofficial runs May 26, 2026
Comment thread packages/app/src/components/GlobalFilterContext.tsx Outdated
Comment thread packages/app/src/components/unofficial-run-provider.tsx
Comment thread packages/app/src/components/inference/InferenceContext.tsx
@Oseltamivir

Contributor Author

Validated the offline B300 TRT path end to end.

  • InferenceX run 27482213487 now uploads canonical results_bmk/agg_bmk.json.
  • /api/unofficial-run?runId=27482213487 returns one B300 / DSV4 / TRT / FP4 / MTP row at 8192/625, concurrency 32.
  • buildChartData creates one E2E and one interactivity point under DeepSeek-V4-Pro_8k/625.
  • Commit 7b4dc87 keeps the branch's 8K/256 preference when a run provides that shape, otherwise it selects the run's actual sequence so the bare unofficial-run URL does not hide this 8K/625 point.
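
The default-sequence rule from the last bullet can be sketched as below. `selectDefaultSequenceSketch` is a hypothetical name, and falling back to the first available sequence when a run ships several is an assumption; the comment above only says the run's actual sequence is selected.

```typescript
const PREFERRED_SEQUENCE_SKETCH = "8k/256";

// Keep the branch's 8k/256 preference when the run provides that shape;
// otherwise fall back to a sequence the run actually contains, so a bare
// unofficial-run URL never lands on an empty slice.
function selectDefaultSequenceSketch(runSequences: string[]): string | null {
  if (runSequences.includes(PREFERRED_SEQUENCE_SKETCH)) {
    return PREFERRED_SEQUENCE_SKETCH;
  }
  return runSequences.length > 0 ? runSequences[0] : null;
}
```

For run 27482213487, which only ships 8k/625, this sketch returns `8k/625` rather than the preferred default.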

Current preview: https://inferencemax-app-git-claude-huawei-950dt-0f2683-semianalysisai.vercel.app/inference?unofficialrun=27482213487

Validation: full unit suite 2,016 tests passed; typecheck, lint, and format checks passed.

@Oseltamivir

Contributor Author

@claude

Allow for selection of multiple sequence length options on one graph. (e.g. 1k1k with 8k1k)

@github-actions

github-actions Bot commented Jun 14, 2026

Contributor

Claude finished @Oseltamivir's task in 25m 15s


Done — multi-select ISL/OSL on the inference chart

Pushed 747f1ad to claude/huawei-950dt-mindie-unofficial. The sequence picker is now a multi-select — pick 1K/1K and 8K/1K together and both overlay on the same chart, each (GPU, sequence) as its own legend line.

How it works

  • New state lives in InferenceContext: extraSequences: Sequence[] with URL param i_seq_extra. Primary selectedSequence semantics are unchanged — the calculator / evaluation / trends tabs see exactly the same singular value they did before. Only the inference tab opts in.
  • New helper module packages/app/src/lib/sequence-synth-key.ts: builds ${origHwKey}__seq<compact> synth hwKeys (e.g. b200_vllm__seq1k1k). Preserves split('_')[0] so getModelSortIndex / isKnownGpu / getVendor keep working — that's why two NVIDIA GPUs across two sequences still land in the NVIDIA hue band but with distinct shades, instead of needing a separate palette.
  • useChartData filters rows by IN-list, groups by sequence, runs transformBenchmarkRows per group, and rewrites each output point's hwKey with the synth form. Synth HardwareConfig entries get the sequence appended to their label ("B200 — 1K / 1K"). filterByGPU is widened to strip the __seq suffix so the GPU selector still matches across sequences.
  • mergeUnofficialIntoOfficial now iterates over (primary + extras). When multi-sequence is on, the synth key shape composes as base__seq<compact>__uorun<id> — the seq tag precedes the run tag so parseSynthHwKey keeps round-tripping.
  • New MultiSequenceSelector in chart-selectors.tsx (max 3 picks). The singular SequenceSelector is left intact for the calculator's needs.
  • Legend reset effect now keys off extraSequences and mergeAsIngested so newly-introduced synth hwKeys become active when the user flips either toggle — fixes the Cursor "Merged series stay legend-inactive" finding on commit 7b4dc87. (The Cursor "Blind 8k/256 hides unofficial rows" finding is already addressed by selectUnofficialDefaultSequence in the same prior commit.)
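
The suffix composition described above (sequence tag inside, run tag outside) can be sketched as follows. The function bodies are assumptions for illustration; the key shapes `base__seq<compact>` and `base__seq<compact>__uorun<id>`, and the compact forms such as `1k1k` and `8k256`, come from this thread.

```typescript
// Compact form drops the slash: "8k/1k" becomes "8k1k".
function toCompactSketch(sequence: string): string {
  return sequence.replace("/", "");
}

// Inner suffix: tags a hwKey with the sequence it was measured at.
function makeSeqSynthKeySketch(origHwKey: string, sequence: string): string {
  return `${origHwKey}__seq${toCompactSketch(sequence)}`;
}

// Outer suffix: the run tag is appended last, so a parser that peels only
// the trailing __uorun part still recovers the sequence-tagged key.
function withRunTagSketch(hwKey: string, runId: string): string {
  return `${hwKey}__uorun${runId}`;
}

// Strip both suffixes (run tag first, then sequence tag) to recover the
// canonical key that a GPU selector would compare against.
function canonicalHwKeySketch(hwKey: string): string {
  return hwKey.replace(/__uorun\d+$/, "").replace(/__seq[0-9a-z]+$/, "");
}
```

Stripping in outer-to-inner order is what lets a GPU filter match the same hardware across sequences and runs.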

Tests

  • sequence-synth-key.test.ts (new, 12 cases) — compact form, round-trip, base prefix preservation, label tagging, __uorun composition, edge cases (unknown compact, missing suffix).
  • unofficial-merge.test.ts (2 new cases) — multi-sequence merge produces base__seq<compact>__uorun<id>; duplicate primary in extras dedups to single-sequence behavior.
  • useChartData.test.ts (1 new case) — filterByGPU strips __seq suffix so canonical GPU picks match across sequences.

Browser verification (Playwright on localhost:3000)

  • /inference → opened the ISL/OSL picker, selected 8K/1K + 1K/1K → both chips show, legend renders e.g. "GB300 NVL72 — 1K / 1K (Dynamo SGLang)" and "GB300 NVL72 — 8K / 1K (Dynamo SGLang)" side by side; chart shows two roofline groups per GPU.
  • URL state round-trips: ?i_seq=8k%2F1k&i_seq_extra=1k%2F1k reloads with both selections active.
  • Unofficial-run overlay composition: ?unofficialrun=27482213487&i_seq_extra=8k%2F1k&i_uoff_ingested=1 — B300 — 8K / 625 (from the run) and B300 — 8K / 1K (official) both appear in the legend simultaneously. The sequence-tag is the inner suffix and the run-tag is the outer suffix, so the existing unofficial-merge code composes cleanly.
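
The URL round-trip in these cases can be sketched with the standard `URLSearchParams` API. The parameter names `i_seq`, `i_seq_extra`, and `i_uoff_ingested` come from this PR; everything else is an assumption, in particular that multiple extras are comma-separated and that the 3-pick limit counts the primary sequence.

```typescript
interface InferenceUrlStateSketch {
  primary: string | null;
  extras: string[];
  mergeAsIngested: boolean;
}

// Assumed: the 3-pick limit includes the primary selection.
const MAX_SEQUENCE_PICKS_SKETCH = 3;

function readInferenceUrlStateSketch(search: string): InferenceUrlStateSketch {
  // URLSearchParams ignores a leading "?" and percent-decodes values,
  // so "8k%2F1k" is read back as "8k/1k".
  const params = new URLSearchParams(search);
  const primary = params.get("i_seq");
  const rawExtras = params.get("i_seq_extra") ?? "";

  // Drop empties, duplicates, and any extra that repeats the primary.
  const extras: string[] = [];
  for (const part of rawExtras.split(",")) {
    const seq = part.trim();
    if (seq !== "" && seq !== primary && !extras.includes(seq)) extras.push(seq);
  }

  return {
    primary,
    extras: extras.slice(0, MAX_SEQUENCE_PICKS_SKETCH - 1),
    mergeAsIngested: params.get("i_uoff_ingested") === "1",
  };
}
```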

Checks

  • pnpm typecheck — clean
  • pnpm exec vitest run: 1798/1798 passed, including new tests
  • pnpm exec oxlint on touched files — 0 new errors. The 6 reported are pre-existing on master (no-negated-condition / require-unicode-regexp in files I didn't add); verified via git stash && pnpm lint.
  • Browser-verified per the cases above

Notes for review

  • Default is a single sequence — nothing changes for users until they explicitly pick a second one. Backwards-compatible.
  • The seq-synth key shape (base__seq<compact>) shares a delimiter convention with the unofficial-merge synth shape (base__uorun<id>) so they compose; parseSynthHwKey was already structured to peel only the trailing __uorun…, which is what we relied on.
  • Cursor's "Popstate skips ingested toggle sync" finding is out of scope for this turn (separate from multi-sequence) and untouched.
  • Out of scope: I did NOT touch the calculator / evaluation / historical-trends tabs. They keep using singular selectedSequence.
    · branch: claude/huawei-950dt-mindie-unofficial

… one chart

The sequence picker becomes a multi-select. Picking >1 (e.g. 1K/1K + 8K/1K)
overlays both shapes on the same axes — each (GPU, sequence) becomes its own
legend line via a `__seq<compact>` hwKey suffix, so vendor-zone colors still
distinguish hardware while the legend caption reads e.g. "B200 — 1K / 1K"
vs "B200 — 8K / 1K".

State: new `extraSequences: Sequence[]` lives in InferenceContext (URL param
`i_seq_extra`, empty default). Primary `selectedSequence` semantics are
unchanged for the calculator / evaluation / trends tabs.
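
A minimal sketch of how `i_seq_extra` could round-trip through the URL. The thread only shows a single extra value (`i_seq_extra=1k%2F1k`), so the comma-separated encoding for several extras is an assumption, as are the helper names `parseExtraSequences` and `writeExtraSequences` and the reading of "max 3 picks" as primary plus extras combined.

```typescript
// Hypothetical helpers; names and the comma-separated encoding are assumptions.
type Sequence = string; // e.g. "8k/1k", as seen in ?i_seq=8k%2F1k

const MAX_PICKS = 3; // assumed to cap primary + extras together

function parseExtraSequences(search: string, primary: Sequence): Sequence[] {
  const raw = new URLSearchParams(search).get('i_seq_extra') ?? '';
  const seen = new Set<Sequence>([primary]);
  const extras: Sequence[] = [];
  for (const part of raw.split(',')) {
    const seq = part.trim();
    if (seq === '' || seen.has(seq)) continue; // skip blanks and duplicates of the primary
    seen.add(seq);
    extras.push(seq);
    if (extras.length === MAX_PICKS - 1) break;
  }
  return extras;
}

function writeExtraSequences(search: string, extras: Sequence[]): string {
  const params = new URLSearchParams(search);
  if (extras.length === 0) params.delete('i_seq_extra'); // empty default stays out of the URL
  else params.set('i_seq_extra', extras.join(','));
  return params.toString();
}
```

Dropping a duplicate of the primary at parse time keeps a hand-edited URL from producing two identical legend groups.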

Pipeline: `useChartData` filters rows by `IN [primary, ...extras]`, groups
by sequence, runs `transformBenchmarkRows` per sequence, and rewrites each
output point's hwKey with `makeSeqSynthKey`. Synth hardware entries get the
sequence appended to their label. `filterByGPU` is widened to strip the
`__seq` suffix so the GPU selector still matches across sequences.
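
The pipeline step can be sketched roughly as follows. This is a simplified stand-in, not the code in `useChartData`: the `Row` and `Point` shapes, the compact form (`"8K/1K"` becoming `"8k1k"`), and the choice to add the suffix only when an extra sequence is selected are all assumptions, and the per-group transform is reduced to a pass-through.

```typescript
// Hypothetical row/point shapes; only the `__seq` suffix idea comes from the PR.
interface Row { hwKey: string; sequence: string; y: number }
interface Point { hwKey: string; y: number }

const SEQ_DELIM = '__seq';

// Assumed compact form: lowercase, slash removed ("8K/1K" becomes "8k1k").
const compact = (seq: string): string => seq.toLowerCase().replace('/', '');

const makeSeqSynthKey = (base: string, seq: string): string =>
  `${base}${SEQ_DELIM}${compact(seq)}`;

// Drop the suffix so a canonical GPU pick matches every sequence variant.
const stripSeqSuffix = (hwKey: string): string => {
  const i = hwKey.indexOf(SEQ_DELIM);
  return i === -1 ? hwKey : hwKey.slice(0, i);
};

function buildPoints(rows: Row[], primary: string, extras: string[]): Point[] {
  const cleanExtras = extras.filter((s) => s !== primary); // duplicate of primary is a no-op
  const selected = [primary, ...cleanExtras];
  // Group rows by sequence, keeping only the selected ones (the IN-list filter).
  const groups = new Map<string, Row[]>();
  for (const row of rows) {
    if (selected.indexOf(row.sequence) === -1) continue;
    const bucket = groups.get(row.sequence) ?? [];
    bucket.push(row);
    groups.set(row.sequence, bucket);
  }
  const points: Point[] = [];
  groups.forEach((group, seq) => {
    // Stand-in for the per-group transformBenchmarkRows call.
    for (const row of group) {
      const hwKey = cleanExtras.length > 0 ? makeSeqSynthKey(row.hwKey, seq) : row.hwKey;
      points.push({ hwKey, y: row.y });
    }
  });
  return points;
}

const filterByGPU = (points: Point[], gpu: string): Point[] =>
  points.filter((p) => stripSeqSuffix(p.hwKey) === gpu);
```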

Unofficial-run merge: when `mergeAsIngested` is on with multiple sequences,
the merger iterates over (primary + extras), composing
`base__seq<compact>__uorun<id>` synth keys so a (run, GPU, sequence) triple
gets its own legend line. `parseSynthHwKey` keeps working because the seq
delimiter precedes the run delimiter.
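
To illustrate why the ordering matters, here is a small sketch of composing and peeling both suffixes. `composeKey` and `parseKey` are hypothetical names; the real `parseSynthHwKey` is described as peeling only the trailing `__uorun…`, whereas this sketch peels the sequence tag as well to show the round-trip.

```typescript
const SEQ_DELIM = '__seq';
const RUN_DELIM = '__uorun';

interface ParsedKey { base: string; seq: string | null; runId: string | null }

function composeKey(base: string, seqCompact: string | null, runId: string | null): string {
  let key = base;
  if (seqCompact !== null) key += SEQ_DELIM + seqCompact; // inner suffix
  if (runId !== null) key += RUN_DELIM + runId; // outer suffix
  return key;
}

function parseKey(key: string): ParsedKey {
  let rest = key;
  let runId: string | null = null;
  let seq: string | null = null;
  // Peel the outer run tag first; whatever remains may still carry a seq tag.
  const r = rest.lastIndexOf(RUN_DELIM);
  if (r !== -1) {
    runId = rest.slice(r + RUN_DELIM.length);
    rest = rest.slice(0, r);
  }
  const s = rest.lastIndexOf(SEQ_DELIM);
  if (s !== -1) {
    seq = rest.slice(s + SEQ_DELIM.length);
    rest = rest.slice(0, s);
  }
  return { base: rest, seq, runId };
}
```

Because the run tag is always outermost, a parser that only knows about `__uorun` still recovers a stable prefix (`base__seq<compact>`) without needing to understand the sequence tag.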

Legend reset effect now also keys off `extraSequences` and `mergeAsIngested`
so newly-introduced synth hwKeys become active when the user flips either
toggle — fixes the Cursor "Merged series stay legend-inactive" finding on
the prior commit where `reconcileActiveSet` only removed stale keys.
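
A framework-free sketch of the reset behaviour described above, assuming the legend keeps a set of active hwKeys. `nextLegendState` and the `LegendState` shape are hypothetical, and the reset key here is built with `JSON.stringify` purely for illustration; the actual key format in `InferenceContext.tsx` may differ.

```typescript
interface LegendState { resetKey: string; active: Set<string> }

function nextLegendState(
  prev: LegendState,
  available: string[],
  extras: string[],
  mergeAsIngested: boolean,
): LegendState {
  const resetKey = JSON.stringify([extras, mergeAsIngested]);
  if (resetKey !== prev.resetKey) {
    // A toggle flipped, so the synth keys are new: activate everything on the chart.
    return { resetKey, active: new Set(available) };
  }
  // Same toggles: only prune keys that are no longer on the chart.
  const availableSet = new Set(available);
  const active = new Set<string>();
  prev.active.forEach((key) => {
    if (availableSet.has(key)) active.add(key);
  });
  return { resetKey, active };
}
```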

Tests:
- `sequence-synth-key.test.ts` (new) — round-trip, base prefix preservation,
  label tagging, compose with `__uorun`.
- `unofficial-merge.test.ts` — multi-sequence merge produces
  `__seq__uorun` shape; primary-only with duplicate extra dedup is a no-op.
- `useChartData.test.ts` — `filterByGPU` strips `__seq` suffix.

Verified in Playwright at localhost:3000/inference: legend renders e.g.
"GB300 NVL72 — 1K / 1K (Dynamo SGLang)" alongside the 8K / 1K variant,
URL state restores from `?i_seq_extra=`, and overlay composition works
with `?unofficialrun=27482213487&i_seq_extra=8k%2F1k&i_uoff_ingested=1`
(B300 — 8K / 625 from the run + B300 — 8K / 1K official both render).

Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>
Comment thread packages/app/src/components/unofficial-run-provider.tsx
@Oseltamivir

Contributor Author

@claude merge main in, and resolve or address the PR reviews

@github-actions

github-actions Bot commented Jun 14, 2026

Contributor

Claude finished @Oseltamivir's task in 15m 41s —— View job


Done — merged master + addressed all four Cursor findings

Pushed two commits to claude/huawei-950dt-mindie-unofficial:

  1. 703c6b9 — merge origin/master (91 commits behind, 9 file conflicts resolved).
  2. c9c31a7 — Cursor fixes #4 and #5 + a small lint cleanup.

Cursor findings — resolution status

| # | Finding | Severity | Status |
|---|---------|----------|--------|
| 1 | Blind 8k/256 hides unofficial rows (GlobalFilterContext.tsx) | Medium | ✅ Already addressed in 7b4dc87: selectUnofficialDefaultSequence(unofficialAvailable, selectedModel, urlModel) only returns EightK_256 when it's in the run's available sequences for the target model, otherwise falls back to a sorted available sequence. Verified the merged code still has the same logic. |
| 2 | Merged series stay legend-inactive (InferenceContext.tsx:316) | High | ✅ Already addressed in 747f1ad — the hw-reset key now includes `${extraSequencesKey} |
| 3 | Popstate skips ingested toggle sync (unofficial-run-provider.tsx) | Low | ✅ Fixed in this turn — load now takes an isPopstate flag and re-reads params.get('i_uoff_ingested') on browser back/forward. |
| 4 | Merge toggle cleared after URL hydration (unofficial-run-provider.tsx) | Medium | ✅ Fixed in this turn — both the useState initializer and the hydration useEffect now use readUrlParams() (the in-memory snapshot captured at module load, before url-state.ts's deferred cleanup), so share links with ?i_uoff_ingested=1 no longer race against URL cleanup. |

Why a single fix can't cover both popstate (#3) and hydration (#4)

  • On initial mount, the URL may have already been stripped by url-state.ts's setTimeout(0) cleanup, so the live URL is unreliable — we must read the in-memory snapshot.
  • On popstate, the browser has restored the previous history entry, so the live URL is the source of truth — and the snapshot is stale.
  • The load(isPopstate) flag picks the right source per call.
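
The source selection can be sketched as a pure function. `readIngestedFlag` is a hypothetical name, and the two search strings are passed in explicitly (rather than read from `window.location.search` and `readUrlParams()`) so the sketch runs outside a browser.

```typescript
// snapshotSearch: query string captured at module load, before the deferred cleanup.
// liveSearch: the query string currently in the address bar.
function readIngestedFlag(
  isPopstate: boolean,
  snapshotSearch: string,
  liveSearch: string,
): boolean {
  // On popstate the browser has restored a history entry, so the live URL wins.
  // On initial mount the live URL may already be stripped, so the snapshot wins.
  const source = isPopstate ? liveSearch : snapshotSearch;
  return new URLSearchParams(source).get('i_uoff_ingested') === '1';
}
```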

Why I didn't add a unit test for the popstate path

  • The existing unofficial-run-provider.test.ts only exercises the pure helpers (buildChartData, parseAvailableModelsAndSequences). The provider itself isn't rendered there, and adding a render harness for one regression case felt out of scope. The behavior is exercised by browser verification + the logic is small.

Checks

  • pnpm typecheck — clean
  • pnpm lint — 0 errors (also cleaned up 2 pre-existing nits in unofficial-merge.test.ts that I touched)
  • pnpm --filter @semianalysisai/inferencex-app exec vitest run: 2090/2090 passed
  • pnpm --filter @semianalysisai/inferencex-constants exec vitest run: 30/30 passed
  • Browser-verified with Playwright that ?unofficialrun=…&i_uoff_ingested=1 hydrates with aria-checked="true" on the scatter-uoff-as-ingested switch, even after url-state.ts runs its deferred URL cleanup. (Did not browser-test popstate end-to-end — sandboxed Playwright + dev mode makes history mutation testing brittle; the popstate code path is small and the logic is straightforward.)

Merge-conflict resolutions (commit 703c6b9)

9 files had textual conflicts — all were "both-sides-added" merges, no semantics had to be reconciled:

  • packages/constants/src/framework-aliases.ts: huawei-mindie + mooncake-atom both registered
  • packages/app/cypress/support/mock-data.ts: hwColorOverrides / extraSequences + compareGpuPair both in mock context
  • packages/app/src/app/api/unofficial-run/route.test.ts — both new test cases preserved
  • packages/app/src/lib/data-mappings.ts — both new imports preserved
  • packages/app/src/components/inference/types.ts — both new context fields preserved
  • packages/app/src/components/inference/hooks/useChartData.ts: getHardwareConfig + hardwareKeyMatchesAnyBase both imported; extraSequences + compareGpuPair both parameters
  • packages/app/src/components/inference/InferenceContext.tsx — switched hasMtpEngineExclusion → getModelExclusion (master refactored exclusion API); kept Sequence import for extraSequences; threaded both effectiveExtraSequences and compareGpuPair into useChartData and context
  • packages/app/src/components/inference/ui/ChartDisplay.tsx — both mergeAsIngested and activeOverlayHwTypes destructured
  • packages/app/src/components/inference/ui/ScatterGraph.tsx — both the "Show as ingested" toggle and master's reshaped log-scale toggle preserved
    · branch: claude/huawei-950dt-mindie-unofficial

github-actions Bot and others added 2 commits June 14, 2026 03:03
…-mindie-unofficial

# Conflicts:
#	packages/app/cypress/support/mock-data.ts
#	packages/app/src/app/api/unofficial-run/route.test.ts
#	packages/app/src/components/inference/InferenceContext.tsx
#	packages/app/src/components/inference/hooks/useChartData.ts
#	packages/app/src/components/inference/types.ts
#	packages/app/src/components/inference/ui/ChartDisplay.tsx
#	packages/app/src/components/inference/ui/ScatterGraph.tsx
#	packages/app/src/lib/data-mappings.ts
#	packages/constants/src/framework-aliases.ts
Two related Cursor findings on the `i_uoff_ingested` URL param:

1. **Merge toggle cleared after URL hydration** — the post-mount sync
   was reading `window.location.search`, but `url-state.ts` strips
   share-link params on a deferred `setTimeout(0)`. If that cleanup ran
   before our effect, a shared link with `i_uoff_ingested=1` would
   silently turn the toggle off. Both the `useState` initializer and the
   hydration effect now read `readUrlParams()` (in-memory snapshot
   captured before cleanup).

2. **Popstate skips ingested toggle sync** — the `popstate` listener
   re-loaded run data but never re-read `i_uoff_ingested`. Browser
   back/forward could leave the toggle out of sync with the URL. The
   load function now takes an `isPopstate` flag and re-syncs the toggle
   from `window.location.search` only on popstate (not on the initial
   call, which would race with the cleanup above).

Also: drop a stale lint nit in `unofficial-merge.test.ts` by switching
the hue-extraction regex to a named-capture-group + unicode flag.

Co-authored-by: Bryan Shan <Oseltamivir@users.noreply.github.com>
@blacksmith-sh

blacksmith-sh Bot commented Jun 14, 2026


Found 8 test failures on Blacksmith runners:

Failures

  • First-load navigation/navigates to comparisons from the header with one click
  • Inference ChartControls/renders the sequence selector with the current sequence
  • Model Architecture Diagram › Alternating Attention Blocks (MoE model - gpt-oss 120B)/"before all" hook for "shows MoE and Sink/Full GQA badges for gpt-oss"
  • Model Architecture Diagram › Alternating Attention Blocks (MoE model - gpt-oss 120B)/"before all" hook for "shows MoE and Sink/Full GQA badges for gpt-oss"
  • Model Architecture Diagram › Collapsible Transformer Blocks (MoE model - MiniMax M3)/"before all" hook for "shows MoE and GQA badges for MiniMax M3"
  • Model Architecture Diagram › Collapsible Transformer Blocks (MoE model - MiniMax M3)/"before all" hook for "shows MoE and GQA badges for MiniMax M3"
  • Model Architecture Diagram › Hybrid Attention Blocks (MoE model - DeepSeek V4 Pro)/"before all" hook for "shows MoE and Hybrid badges for DeepSeek V4 Pro"
  • Model Architecture Diagram › Hybrid Attention Blocks (MoE model - DeepSeek V4 Pro)/"before all" hook for "shows MoE and Hybrid badges for DeepSeek V4 Pro"


@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.


Reviewed by Cursor Bugbot for commit c9c31a7.

  // official series via InferenceContext, so suppress the X-shape overlay layer.
  const overlayDataByChartType = useMemo(() => {
-   if (!unofficialRunInfo || !getOverlayData) {
+   if (mergeAsIngested || !unofficialRunInfo || !getOverlayData) {

Overlay ignores extra sequences

Medium Severity

With unofficial overlay mode (not “Show as ingested”), overlay points are loaded only for the primary selectedSequence. Extra ISL/OSL shapes from extraSequences appear in official useChartData series but never in the X-shape overlay, so unofficial points for those sequences are missing on the same chart.


const url = new URL(window.location.href);
if (v) url.searchParams.set('i_uoff_ingested', '1');
else url.searchParams.delete('i_uoff_ingested');
window.history.replaceState({}, '', url);

Share URL drops ingested toggle

Medium Severity

Toggling “Show as ingested” updates the address bar via history.replaceState but does not call writeUrlParams, so i_uoff_ingested never enters currentState. buildShareUrl only serializes currentState (plus unofficial run ids from the live URL), so generated share links omit the merge-as-ingested setting even when the toggle is on.
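
One way to picture the gap: if share links are built from an in-memory state object, a param has to be written into that object, not only into the address bar. The sketch below uses a toy store; `shareState`, `writeParams`, `buildShareLink`, and `setMergeAsIngested` are hypothetical stand-ins, since the real `writeUrlParams` and `buildShareUrl` signatures are not shown in this thread.

```typescript
// Toy stand-in for the app's URL-state module (an assumption, not the real API).
const shareState: Record<string, string> = {};

function writeParams(patch: Record<string, string | null>): void {
  for (const key of Object.keys(patch)) {
    const value = patch[key];
    if (value === null) delete shareState[key]; // null removes the param
    else shareState[key] = value;
  }
}

function buildShareLink(base: string): string {
  const url = new URL(base);
  for (const key of Object.keys(shareState)) {
    url.searchParams.set(key, shareState[key]);
  }
  return url.toString();
}

// Routing the toggle through the store means share links pick it up.
function setMergeAsIngested(on: boolean): void {
  writeParams({ i_uoff_ingested: on ? '1' : null });
}
```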

